Fix #8926: ListSerializer supports instance access during validation for many=True by zainnadeem786 · Pull Request #9879 · encode/django-rest-framework

zainnadeem786 · 2026-01-25T17:36:40Z

Summary

This PR fixes issue #8926 by updating ListSerializer to preserve and provide access to self.instance during validation when many=True. Previously, child serializers in bulk updates could not access their corresponding instance, causing AssertionErrors and inconsistent behavior. This update ensures that each item in a list serializer automatically matches its input data to the correct instance using id or pk.

Key Enhancements

Automated Instance Matching
- ListSerializer.run_child_validation now attempts to match input data to items in self.instance.
- Builds an instance map for O(1) lookup during validation.
- Supports subclasses that may override instance assignments.

Validation Fixes
- Avoided premature access to validated_data by returning run_validation results directly.
- Manually restores instance and initial_data in deepcopied child serializers.
- Partial updates (partial=True) correctly propagate from root serializer to list items.
- Standardized error reporting in to_internal_value for positional list errors.

Test Suite Updates
- Updated 37 tests in tests/test_serializer_lists.py to reflect consistent validation and instance matching behavior.
- Added regression test test_many_true_regression_8926 to confirm that validate_<field> methods can now access self.instance during bulk updates.
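
The matching step described above can be sketched in plain Python, without Django REST Framework. This is only an illustration under stated assumptions: `Item`, `build_instance_map`, and `match_instance` are hypothetical names, not part of the PR or the library. The sketch also uses explicit `None` checks when reading the key from the input, so that an `id` of 0 is still matched; an `or` chain between `id` and `pk` would skip it.

```python
from collections.abc import Mapping


class Item:
    """Hypothetical stand-in for a model instance; only `pk` matters here."""

    def __init__(self, pk, name):
        self.pk = pk
        self.name = name


def build_instance_map(instances):
    """Map str(pk) -> instance so each lookup during validation is O(1)."""
    if instances is None:
        return {}
    if isinstance(instances, Mapping):
        return {str(key): value for key, value in instances.items()}
    instance_map = {}
    for obj in instances:
        pk = getattr(obj, "pk", getattr(obj, "id", None))
        if pk is not None:
            instance_map[str(pk)] = obj
    return instance_map


def match_instance(instance_map, data):
    """Return the instance keyed by data['id'] (or data['pk']), else None."""
    if not isinstance(data, dict):
        return None
    data_pk = data.get("id")
    if data_pk is None:  # explicit None check so that id=0 is still matched
        data_pk = data.get("pk")
    if data_pk is None:
        return None
    return instance_map.get(str(data_pk))


items = [Item(0, "zero"), Item(7, "seven")]
instance_map = build_instance_map(items)
payload = [{"id": 7}, {"id": "0"}, {"name": "new"}]
matched = [match_instance(instance_map, entry) for entry in payload]
```

Keys are compared as strings, so an integer primary key and its string form in the request body resolve to the same instance; an entry without a key simply gets no instance.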

Verification

Ran pytest tests/test_serializer_lists.py → all 37 tests passed.
Confirmed that individual list items now correctly reference their associated instance during validation.
Verified correct handling of allow_empty, min_length, max_length, and nested serializers.

Notes

This PR does not change the public API of ListSerializer.
It improves reliability and consistency for serializers using many=True, particularly for update operations.

Related Issues

Fixes #8926 ("Invalid self.instance when validating the serializer using many=True").

Copilot

Pull request overview

This PR addresses issue #8926 by implementing automated instance matching in ListSerializer for bulk validation operations with many=True. The core enhancement allows child serializers to access their corresponding instance during validation by automatically matching input data to instances using id or pk fields.

Changes:

- Automated instance-to-data matching in ListSerializer.run_child_validation using a pk-based lookup map
- Enhanced error handling with consistent ErrorDetail wrapping for validation errors
- Updated test suite with new regression test and corrected assertions for validation behavior

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 15 comments.

File	Description
rest_framework/serializers.py	Core changes to ListSerializer adding automated instance matching, improved error handling, and validation flow updates
tests/test_serializer_lists.py	Updated existing tests for consistency and added regression test for issue #8926
.gitignore	Added venv/ to ignored paths

Comments suppressed due to low confidence (1)

rest_framework/serializers.py:729

Several docstrings and inline comments were removed (e.g., for get_value, run_validation, to_internal_value, to_representation methods). While the code may be self-documenting to some extent, these comments provided useful context about the purpose and behavior of these methods. Consider keeping at least the docstrings for public methods to maintain API documentation quality, especially since this is a framework used by many developers.

    def get_value(self, dictionary):
        if html.is_html_input(dictionary):
            return html.parse_html_list(dictionary, prefix=self.field_name, default=empty)
        return dictionary.get(self.field_name, empty)

    def run_validation(self, data=empty):
        is_empty_value, data = self.validate_empty_values(data)
        if is_empty_value:
            return data

        value = self.to_internal_value(data)
        try:
            self.run_validators(value)
            value = self.validate(value)
            assert value is not None, '.validate() should return the validated data'
        except (ValidationError, DjangoValidationError) as exc:
            raise ValidationError(detail=as_serializer_error(exc))

        return value

    def run_child_validation(self, data):
        child = copy.deepcopy(self.child)
        if getattr(self, 'partial', False) or getattr(self.root, 'partial', False):
            child.partial = True

        # Field.__deepcopy__ re-instantiates the field, wiping any state.
        # If the subclass set an instance or initial_data on self.child,
        # we manually restore them to the deepcopied child.
        child_instance = getattr(self.child, 'instance', None)
        if child_instance is not None and child_instance is not self.instance:
            child.instance = child_instance
        elif hasattr(self, '_instance_map') and isinstance(data, dict):
            # Automated instance matching (#8926)
            data_pk = data.get('id') or data.get('pk')
            if data_pk is not None:
                child.instance = self._instance_map.get(str(data_pk))
            else:
                child.instance = None
        else:
            child.instance = None

        child_initial_data = getattr(self.child, 'initial_data', empty)
        if child_initial_data is not empty:
            child.initial_data = child_initial_data
        else:
            # Set initial_data for item-level validation if not already set.
            child.initial_data = data

        validated = child.run_validation(data)
        return validated

    def to_internal_value(self, data):
        if html.is_html_input(data):
            data = html.parse_html_list(data, default=[])

        if not isinstance(data, list):
            raise ValidationError({
                api_settings.NON_FIELD_ERRORS_KEY: [
                    self.error_messages['not_a_list'].format(input_type=type(data).__name__)
                ]
            })

        if not self.allow_empty and len(data) == 0:
            raise ValidationError({
                api_settings.NON_FIELD_ERRORS_KEY: [ErrorDetail(self.error_messages['empty'], code='empty')]
            })

        if self.max_length is not None and len(data) > self.max_length:
            raise ValidationError({
                api_settings.NON_FIELD_ERRORS_KEY: [ErrorDetail(self.error_messages['max_length'].format(max_length=self.max_length), code='max_length')]
            })

        if self.min_length is not None and len(data) < self.min_length:
            raise ValidationError({
                api_settings.NON_FIELD_ERRORS_KEY: [ErrorDetail(self.error_messages['min_length'].format(min_length=self.min_length), code='min_length')]
            })

        # Build a primary key mapping for instance updates (#8926)
        instance_map = {}
        if self.instance is not None:
            if isinstance(self.instance, Mapping):
                instance_map = {str(k): v for k, v in self.instance.items()}
            elif hasattr(self.instance, '__iter__'):
                for obj in self.instance:
                    pk = getattr(obj, 'pk', getattr(obj, 'id', None))
                    if pk is not None:
                        instance_map[str(pk)] = obj

        self._instance_map = instance_map

        try:
            ret = []
            errors = []

            for item in data:
                try:
                    validated = self.run_child_validation(item)
                except ValidationError as exc:
                    errors.append(exc.detail)
                else:
                    ret.append(validated)
                    errors.append({})

            if any(errors):
                raise ValidationError(errors)

            return ret
        finally:
            delattr(self, '_instance_map')

    def to_representation(self, data):
        # Dealing with nested relationships, data can be a Manager,
        # so, first get a queryset from the Manager if needed.
        # We avoid .all() on QuerySets to preserve Issue #2704 behavior.
        iterable = data.all() if isinstance(data, models.manager.BaseManager) else data

        return [
            self.child.to_representation(item) for item in iterable
        ]

Copilot · 2026-02-24T15:22:39Z

rest_framework/serializers.py

+            elif hasattr(self.instance, '__iter__'):
+                for obj in self.instance:
+                    pk = getattr(obj, 'pk', getattr(obj, 'id', None))
+                    if pk is not None:
+                        instance_map[str(pk)] = obj


The instance mapping logic checks hasattr(self.instance, '__iter__') to handle iterables. However, strings are also iterable in Python, which could cause unexpected behavior if a string is accidentally passed as instance. Consider adding an additional check to ensure self.instance is not a string, or explicitly check for list/tuple/QuerySet types.
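
A minimal sketch of the concern, in plain Python: a string passes the `__iter__` check, while an explicit type check rejects it. The helper name `is_instance_collection` is hypothetical, and `QuerySet` is left out of the tuple only so the snippet runs without Django installed.

```python
def is_instance_collection(value):
    """True for list/tuple containers; False for strings, bytes, and scalars."""
    return isinstance(value, (list, tuple))


# A string is iterable, so the duck-typed check accepts it by accident.
string_has_iter = hasattr("abc", "__iter__")

checks = {
    "string": is_instance_collection("abc"),
    "bytes": is_instance_collection(b"abc"),
    "list": is_instance_collection([1, 2]),
    "tuple": is_instance_collection((1, 2)),
}
```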

Copilot · 2026-02-24T15:22:39Z

rest_framework/serializers.py

    def save(self, **kwargs):
-        """
-        Save and return a list of object instances.
-        """
-        # Guard against incorrect use of `serializer.save(commit=False)`
-        assert 'commit' not in kwargs, (
-            "'commit' is not a valid keyword argument to the 'save()' method. "
-            "If you need to access data before committing to the database then "
-            "inspect 'serializer.validated_data' instead. "
-            "You can also pass additional keyword arguments to 'save()' if you "
-            "need to set extra attributes on the saved model instance. "
-            "For example: 'serializer.save(owner=request.user)'.'"
-        )
-
-        validated_data = [
-            {**attrs, **kwargs} for attrs in self.validated_data
-        ]
+        assert hasattr(self, 'validated_data'), "Call `.is_valid()` before `.save()`."
+        validated_data = [{**item, **kwargs} for item in self.validated_data]

        if self.instance is not None:
            self.instance = self.update(self.instance, validated_data)
-            assert self.instance is not None, (
-                '`update()` did not return an object instance.'
-            )
        else:
            self.instance = self.create(validated_data)
-            assert self.instance is not None, (
-                '`create()` did not return an object instance.'
-            )
-
        return self.instance


The ListSerializer.save() method overrides BaseSerializer.save() but removes several important assertions that exist in the parent class: the 'commit' keyword check, the check preventing save() after accessing data, and the checks for errors. This means users could potentially call save(commit=False) on a ListSerializer or save after accessing .data without getting the helpful error messages from BaseSerializer. Consider re-adding these checks or calling super().save() if possible to maintain consistency with the parent class behavior.
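
The guard behavior under discussion can be illustrated with a small stand-in class. This is a sketch, not the library's code: `GuardedSaver` and `failure_message` are hypothetical, the assertion messages are shortened, and `update()` is deliberately written to return nothing so the non-None check has something to catch.

```python
class GuardedSaver:
    """Hypothetical object that keeps the save() guards instead of dropping them."""

    def __init__(self, validated_data, instance=None):
        self.validated_data = validated_data
        self.instance = instance

    def create(self, validated_data):
        return [dict(attrs) for attrs in validated_data]

    def update(self, instance, validated_data):
        return None  # deliberately forgets to return the updated objects

    def save(self, **kwargs):
        # Reject save(commit=False) with an explicit message.
        assert "commit" not in kwargs, "'commit' is not a valid keyword argument to save()."
        validated_data = [{**attrs, **kwargs} for attrs in self.validated_data]
        if self.instance is not None:
            self.instance = self.update(self.instance, validated_data)
            assert self.instance is not None, "`update()` did not return an object instance."
        else:
            self.instance = self.create(validated_data)
            assert self.instance is not None, "`create()` did not return an object instance."
        return self.instance


def failure_message(saver, **kwargs):
    """Return the assertion message raised by save(), or None if it succeeds."""
    try:
        saver.save(**kwargs)
    except AssertionError as exc:
        return str(exc)
    return None


commit_error = failure_message(GuardedSaver([{"title": "a"}]), commit=False)
update_error = failure_message(GuardedSaver([{"title": "a"}], instance=["old"]))
created = GuardedSaver([{"title": "a"}]).save(owner="me")
```

Without the post-update assertion, the second case would silently leave `instance` set to `None`.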

Copilot · 2026-02-24T15:22:40Z

tests/test_serializer_lists.py

+def test_many_true_instance_level_validation_guidance():
+    class Obj:
+        def __init__(self, valid):
+            self.valid = valid
+
+    class TestSerializer(serializers.Serializer):
+        status = serializers.CharField()
+
+        def validate_status(self, value):
+            if self.instance is None:
+                # Provide guidance if user tries to use instance-level validation with many=True
+                raise serializers.ValidationError(
+                    "You tried to access self.instance in a many=True update, "
+                    "but it is not set by default. Override run_child_validation "
+                    "to set the individual instance."
+                )
+            if not self.instance.valid:
+                raise serializers.ValidationError("Invalid instance")
+            return value
+
+    objs = [Obj(True), Obj(False)]
+
+    serializer = TestSerializer(
+        instance=objs,
+        data=[{"status": "ok"}, {"status": "fail"}],
+        many=True,
+        partial=True,
+    )
+
+    with pytest.raises(serializers.ValidationError) as exc:
+        serializer.is_valid(raise_exception=True)
+
+    assert "run_child_validation" in str(exc.value)


The test test_many_true_instance_level_validation_guidance creates objects without 'id' or 'pk' attributes, so the automated instance matching won't occur and self.instance will remain None. This means the test correctly validates that when instances don't have id/pk attributes, the validation guidance message is raised. However, the test name and intent suggest it's demonstrating a pattern that users should follow, when actually the PR introduces automated matching that eliminates the need for this pattern in most cases. Consider renaming this test or adding a comment to clarify it's testing the edge case where objects don't have id/pk attributes.
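
To see why no matching happens in that test, the key lookup can be replayed in isolation. This sketch reuses the `getattr` fallback shown in the diff; the `Obj` class mirrors the one in the test, and everything else is plain Python for illustration only.

```python
class Obj:
    """Same shape as the test's object: it has no `pk` or `id` attribute."""

    def __init__(self, valid):
        self.valid = valid


objs = [Obj(True), Obj(False)]

# The fallback chain yields None for every object, so nothing is indexed.
keys = [getattr(obj, "pk", getattr(obj, "id", None)) for obj in objs]
instance_map = {str(key): obj for key, obj in zip(keys, objs) if key is not None}

# The payload carries no key either, so there is nothing to look up.
payload = [{"status": "ok"}, {"status": "fail"}]
payload_keys = [entry.get("id", entry.get("pk")) for entry in payload]
```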

Copilot · 2026-02-24T15:22:40Z

rest_framework/serializers.py

-            else:
-                ret.append(validated)
-                errors.append({})
+        self._instance_map = instance_map


The _instance_map is stored as an instance attribute during validation. If the same ListSerializer instance is used concurrently (e.g., in multiple threads), this could lead to race conditions where one thread's instance map overwrites another's. While Django REST Framework serializers are typically instantiated per-request, this could be an issue in certain deployment scenarios. Consider documenting that ListSerializer instances should not be shared across threads during validation, or adding thread-safety mechanisms if concurrent usage is expected.
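
One way to sidestep the shared-attribute concern is to keep the lookup table in a local variable and pass it down explicitly. The sketch below is an assumption about a possible design, not what the PR does: `ListValidator`, `validate_item`, and `validate_all` are hypothetical names, and the thread pool is only there to show that concurrent calls on one object do not interfere.

```python
from concurrent.futures import ThreadPoolExecutor


class ListValidator:
    """Hypothetical validator that never stores per-call state on itself."""

    def validate_item(self, item, instance_map):
        return {"data": item, "instance": instance_map.get(str(item.get("id")))}

    def validate_all(self, items, instances):
        # The map lives only in this call frame, so threads cannot overwrite it.
        instance_map = {str(pk): obj for pk, obj in instances.items()}
        return [self.validate_item(item, instance_map) for item in items]


validator = ListValidator()
jobs = [
    ([{"id": 1}], {1: "alpha"}),
    ([{"id": 1}], {1: "beta"}),
]
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(lambda job: validator.validate_all(*job), jobs))
```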

can you please cross check this?

Copilot · 2026-02-24T15:22:43Z

rest_framework/serializers.py

        if self.instance is not None:
            self.instance = self.update(self.instance, validated_data)


The assertion assert self.instance is not None, ('update() did not return an object instance.') was removed from the save method. This removes an important safety check that ensures update() returns a valid instance. Without this check, if update() returns None, self.instance will be set to None without any warning, which could lead to silent failures. Consider keeping this assertion for defensive programming.

auvipy

can you please cross check the suggestions?

zainnadeem786 · 2026-02-24T18:15:21Z

Pushed follow-up fixes addressing the review feedback:

- Standardized not_a_list error output using ErrorDetail.
- Updated instance iterable checks to use explicit types (list, tuple, QuerySet).
- Preserved ListSerializer.save() safety assertions, including is_valid checks, invalid-data handling, .data access, and non-None create/update guarantees.
- Made _instance_map cleanup defensive.
- Documented duplicate-key behavior in instance mapping (last-write-wins semantics).
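
The last-write-wins behavior mentioned above follows directly from building the map with plain dictionary assignment. A small sketch, with a hypothetical `Row` class standing in for model instances:

```python
class Row:
    """Hypothetical stand-in for an instance with a primary key."""

    def __init__(self, pk, label):
        self.pk = pk
        self.label = label


rows = [Row(1, "first"), Row(2, "only"), Row(1, "second")]

instance_map = {}
for row in rows:
    # A later object with the same key replaces the earlier one.
    instance_map[str(row.pk)] = row

winner = instance_map["1"].label
```

Callers that need to reject duplicates rather than overwrite them would have to check for an existing key before assigning.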

Validation performed locally:

tests/test_serializer_lists.py

Issue #2704 regression test

Full test suite

All tests passing.

zainnadeem786 · 2026-02-25T04:41:23Z

Thanks for the review. I've now reverted the unintended docstring changes in serializers.py.

What was fixed
Restored as_serializer_error docstring to exactly match current upstream/main.
Restored raise_errors_on_nested_writes docstring to exactly match current upstream/main.

Scope

This change is docstring-only.
No functional ListSerializer logic was modified.
No unrelated sections from #9870 were changed.

Validation

Ran full test suite locally: all passing.

Copilot

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

zainnadeem786 closed this Jan 25, 2026

zainnadeem786 reopened this Jan 25, 2026

zainnadeem786 force-pushed the improve-many-true-validation-guidance branch from d42540b to c205e9f Compare January 25, 2026 18:22

Fix encode#8926: ListSerializer preserves instance for many=True during validation and passes all tests

f0375ca

zainnadeem786 force-pushed the improve-many-true-validation-guidance branch from c205e9f to f0375ca Compare January 25, 2026 18:58

auvipy requested review from auvipy, Copilot and peterthomassen and removed request for auvipy, Copilot and peterthomassen February 24, 2026 15:14

Copilot started reviewing on behalf of auvipy February 24, 2026 15:15

Copilot AI reviewed Feb 24, 2026

Update rest_framework/serializers.py

ac82e50

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

auvipy requested changes Feb 24, 2026

auvipy and others added 5 commits February 24, 2026 21:36

Merge branch 'main' into improve-many-true-validation-guidance

07de4b8

Fix encode#8926 with minimal ListSerializer instance matching changes

c402a57

Keep virtualenv ignored in .gitignore

66b8012

Fix Copilot/auvipy review: safe iterable check, restore save() assertions, standardize errors

90e1a24

Refine ListSerializer review follow-ups and cleanup

0acf49a

zainnadeem786 requested a review from auvipy February 24, 2026 18:15

browniebroke reviewed Feb 24, 2026

rest_framework/serializers.py (resolved)

rest_framework/serializers.py (outdated, resolved)

Restore serializers docstrings from upstream main

1484520

zainnadeem786 requested a review from browniebroke February 25, 2026 04:41

Update rest_framework/serializers.py

ef7e976

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

auvipy requested a review from Copilot February 25, 2026 08:54

Copilot started reviewing on behalf of auvipy February 25, 2026 08:54

Merge branch 'main' into improve-many-true-validation-guidance

c665595

Copilot AI reviewed Feb 25, 2026

rest_framework/serializers.py (outdated, resolved)

rest_framework/serializers.py (outdated, resolved)

zainnadeem786 and others added 2 commits February 25, 2026 14:14

Remove unreachable return in ListSerializer.to_internal_value

b61b472

Update rest_framework/serializers.py

22caa96

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

3 participants